Objective Speech Quality Estimation using Gaussian Mixture Models
Abstract
In this thesis, we propose the use of Gaussian mixture models (GMMs) as simple yet effective predictors of perceived speech quality. A large pool of perceptual distortion features is extracted from speech files. Initially, statistical data mining algorithms are used to sift the most relevant variables from the pool. We show that the five most salient feature variables are sufficient to construct good GMM-based estimators of subjective listening quality. It is also shown, however, that the features selected by the data mining schemes limit the performance of the proposed voice quality predictor. To overcome this limitation, a novel feature selection algorithm that directly optimizes GMM prediction performance is proposed. The algorithm performs an N-survivor search, trading off complexity against accuracy via the parameter N. Comparisons with PESQ, the current state-of-the-art speech quality estimation algorithm, show that the proposed algorithm achieves, on average, 26.12% higher correlation and 18.04% lower root-mean-squared error (RMSE). Tested on unseen data, the proposed algorithm reduces RMSE by an average of 41% relative to PESQ.
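The abstract does not spell out the N-survivor search, so the following Python sketch is only one plausible reading of it: at each stage every surviving feature subset is extended by one unused feature, all candidates are scored, and only the N best are kept. The function name `n_survivor_search`, the `score` callable, and the toy score table are assumptions made for illustration; in the thesis the score would be the GMM prediction performance, which is not reproduced here.

```python
from typing import Callable, FrozenSet, List, Sequence, Tuple


def n_survivor_search(
    features: Sequence[str],
    score: Callable[[FrozenSet[str]], float],
    subset_size: int,
    n_survivors: int,
) -> Tuple[FrozenSet[str], float]:
    """Return the best feature subset found and its score (higher is better).

    Hypothetical sketch: `score` stands in for GMM prediction performance.
    """
    if not 1 <= subset_size <= len(features):
        raise ValueError("subset_size must be between 1 and len(features)")
    if n_survivors < 1:
        raise ValueError("n_survivors must be at least 1")

    survivors: List[FrozenSet[str]] = [frozenset()]
    for _ in range(subset_size):
        # Extend every survivor by one unused feature; the set drops duplicates.
        candidates = {s | {f} for s in survivors for f in features if f not in s}
        # Keep only the N best candidates for the next stage.
        survivors = sorted(candidates, key=score, reverse=True)[:n_survivors]

    best = survivors[0]
    return best, score(best)


# Toy score table with made-up numbers, chosen so that a purely greedy
# search (N = 1) misses the best pair while N = 2 finds it.
TOY_SCORES = {
    frozenset("a"): 0.5,
    frozenset("b"): 0.4,
    frozenset("c"): 0.1,
    frozenset("ab"): 0.6,
    frozenset("ac"): 0.55,
    frozenset("bc"): 0.9,
}


def toy_score(subset: FrozenSet[str]) -> float:
    return TOY_SCORES[subset]
```

On the toy table, `n_survivor_search(["a", "b", "c"], toy_score, 2, 1)` keeps only `{a}` after the first stage and ends at `{a, b}`, whereas `n_survivors=2` also keeps `{b}` and reaches the higher-scoring `{b, c}`. This illustrates the trade-off the abstract attributes to N: a larger N scores more candidates per stage in exchange for a better chance of finding a good subset.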
Related resources
Speech Enhancement Using Gaussian Mixture Models, Explicit Bayesian Estimation and Wiener Filtering
Gaussian Mixture Models (GMMs) of power spectral densities of speech and noise are used with explicit Bayesian estimations in Wiener filtering of noisy speech. No assumption is made on the nature or stationarity of the noise. No voice activity detection (VAD) or any other means is employed to estimate the input SNR. The GMM mean vectors are used to form sets of over-determined system of equatio...
Speech Enhancement using Laplacian Mixture Model under Signal Presence Uncertainty
In this paper, an estimator for speech enhancement based on a Laplacian mixture model is proposed. The proposed method estimates the complex DFT coefficients of clean speech from noisy speech using the MMSE estimator, where the clean speech DFT coefficients are assumed to follow a mixture of Laplacians and the DFT coefficients of the noise are assumed to follow a zero-mean Gaussian distribution. Furthermore, the MMS...
Speech quality estimation using Gaussian mixture models
We propose a novel method to estimate the quality of coded speech signals. The joint probability distribution of the subjective mean opinion score (MOS) and perceptual distortion feature variables is modelled using a Gaussian mixture density. The feature variables are sifted from a large pool of candidate features using statistical data mining techniques. We study what combinations of features ...
Objective Speech Quality Assessment Using Gaussian Mixture Models
Objective speech quality assessment algorithms provide low-cost and online monitoring of voice calls, replacing costly and time-consuming subjective listening tests. We propose a novel approach to objective speech quality measurements using Gaussian mixture models (GMMs). A large pool of perceptual distortion features is extracted from speech files and multivariate adaptive regression splines (M...
IMAGE SEGMENTATION USING GAUSSIAN MIXTURE MODEL
Stochastic models such as mixture models, graphical models, Markov random fields and hidden Markov models play a key role in probabilistic data analysis. In this paper, we fit a Gaussian mixture model to the pixels of an image. The parameters of the model are estimated with the EM algorithm. In addition, each pixel of the true image is labeled using Bayes' rule. In fact, ...